xkcd commentary on p-values

xkcd commentary on p-values

Packages Needed

There are many packages to load. Here is a summary table of what each package does.

Package Purpose
car Anova() function to extract type III sums of squares
lme4 mixed models
nlme mixed models, non-linear models, alternative covariance structures
emmeans for extracting least squares means and contrasts
lmer test improved summary functions of lmer objects
dplyr data organization
forcats for managing categorical data
agridat has many agricultural data sets
agricolae has options for many common agricultural experimental designs
library(car)
library(lme4)
library(nlme)
library(emmeans) #note that this package also requires "multcompView" be installed (but it does not need to be loaded separately)
library(lmerTest)
library(dplyr)
library(forcats)
library(agridat)

A function for evaluating the number of missing data:

(we will use this later, but it is not actually related to ANOVA)

na_func <- function(df) {
  apply(df, 2, function(x) sum(is.na(x))) 
}

Introduction

ANOVA in R is a unfortunately a bit complicated. Unlike SAS, ANOVA functions in R lack a consistent structure, consistent output and the accessory packages for ANOVA display a patchwork of compatibility. The result is that it is easy to misspecify a model or make other mistakes. The information below is intended to serve as a guide through the R ANOVA wilderness.

Formula Notation

There are some consistent features across ANOVA methods in R. Formula notation is often used in the R syntax for ANOVA functions. It looks like this: $Y ~ X, where Y is the dependent variable (the response) and X is/are the independent variable(s) (e.g. the experimental treatments).

my_formula <- formula(Y ~ treatment1 + treatment2)
class(my_formula)
## [1] "formula"
my_formula
## Y ~ treatment1 + treatment2

Often the independent variables (i,e, the treatments or the x variables) are expected to be factors, another type of R object:

my_var <- c(rep("low",5), rep("high", 5))
class(my_var) #check what variable type it is
## [1] "character"
# since "my_var" is not type factor, it needs to be converted to a 
my_factor <- as.factor(my_var) # convert character variable to a factor
class(my_factor) # check variable type again to confirm
## [1] "factor"

Sometimes, there is a need to alter the order of treatment levels (well, how R sees those levels)

levels(my_factor) # see how R has ordered the factor levels and which levels are present
## [1] "high" "low"
# the default is alphabetical ordering
# the orders can be manually set:
my_factor <- factor(my_factor, levels = c("low", "high")) 

Knowing the level order is important because in the implementation of ANOVA in R, the first level is treated as the reference level.

More on formulas:

The formula first shown, Y ~ treatment1 + treatment2, includes main effects only. Other formula notation includes the symbols : and *.

formula(Y ~ treatment1:treatment2) # interaction only
## Y ~ treatment1:treatment2
formula(Y ~ treatment1*treatment2) # interaction plus main effects
## Y ~ treatment1 * treatment2

These two formulas are equivalent:

formula(Y ~ treatment1 + treatment2 + treatment1:treatment2) 
formula(Y ~ treatment1*treatment2) 

Perhaps you can see from these examples that formulas are a really just a collections of characters (that is, a string) and exist independent of any data set. Later, we will need to link these formulas to a data set to actually construct a linear model and conduct statistical analysis.

ANOVA for fixed effects models

Completely Randomised design

First, load the data set “warpbreaks” (a data set from base R). This is an old data set with variables for wool type (A and B) and tension on the loom (L, M or H). The response variable is “breaks”, the number of times the wool thread breaks on industrial looms.

I always like to have a quick look at the data before running any statistical tests. So, here we go:

data(warpbreaks)
na_func(warpbreaks)
##  breaks    wool tension 
##       0       0       0
str(warpbreaks)
## 'data.frame':    54 obs. of  3 variables:
##  $ breaks : num  26 30 54 25 70 52 51 26 67 18 ...
##  $ wool   : Factor w/ 2 levels "A","B": 1 1 1 1 1 1 1 1 1 1 ...
##  $ tension: Factor w/ 3 levels "L","M","H": 1 1 1 1 1 1 1 1 1 2 ...
table(warpbreaks$wool, warpbreaks$tension)
##    
##     L M H
##   A 9 9 9
##   B 9 9 9
hist(warpbreaks$breaks, col = "gold")

boxplot(breaks ~ wool, data = warpbreaks, col = "orangered")

boxplot(breaks ~ tension, data = warpbreaks, col = "chartreuse")

This data set has 2 treatments. We don’t know if there is an interaction between the variables, yet. A good start is to run a linear model using lm() function, the linear regression function. As a reminder, ANOVA is a special case of the linear regression model where the predictors (the experimental treatments) are categories rather than a continuous variable.

# run standard linear model for main effects only
lm.mod1 <- lm(breaks ~ wool + tension, data = warpbreaks)

# extract type III sums of squares from that model
Anova(lm.mod1, type = "III")
## Anova Table (Type III tests)
## 
## Response: breaks
##              Sum Sq Df  F value    Pr(>F)    
## (Intercept) 20827.0  1 154.3226 < 2.2e-16 ***
## wool          450.7  1   3.3393  0.073614 .  
## tension      2034.3  2   7.5367  0.001378 ** 
## Residuals    6747.9 50                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# run a linear model with main effects and interactions
lm.mod2 <- lm(breaks ~ wool*tension, data = warpbreaks)

# ...and type III sums of squares 
Anova(lm.mod2, type = "III")
## Anova Table (Type III tests)
## 
## Response: breaks
##               Sum Sq Df  F value    Pr(>F)    
## (Intercept)  17866.8  1 149.2757 2.426e-16 ***
## wool          1200.5  1  10.0301 0.0026768 ** 
## tension       2468.5  2  10.3121 0.0001881 ***
## wool:tension  1002.8  2   4.1891 0.0210442 *  
## Residuals     5745.1 48                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# FYI: 
# type II sums of squares: 
# Anova(lm.mod2, type = "II")
# type I sums of squares: 
# anova(lm.mod2)

A few comments on types of sums of squares:

As a reminder, the type of sums of squares used in statistical tests can impact the results and subsequent interpretation. Type I, sums of squares tests for statistical significance by adding one variable to the model at time (and hence is also called “sequential”). It is very dependent on the order variables are added to the model and hence is often not the best choice for many agricultural experiment. Type III sums of squares (also called “partial” or “marginal”) evaluates the statistical significance of variable or interaction, assuming that the other variables are in the model. This is a decent default option. The last option is Type II sums of squares, which is the best option when you are sure there are no interactions between variables.

Compare Models

# conduct an F test comparing the models
anova(lm.mod1, lm.mod2)
## Analysis of Variance Table
## 
## Model 1: breaks ~ wool + tension
## Model 2: breaks ~ wool * tension
##   Res.Df    RSS Df Sum of Sq      F  Pr(>F)  
## 1     50 6747.9                              
## 2     48 5745.1  2    1002.8 4.1891 0.02104 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# also, consider doing a stepwise approach for finding the best model:
step(lm.mod2)
## Start:  AIC=264.02
## breaks ~ wool * tension
## 
##                Df Sum of Sq    RSS    AIC
## <none>                      5745.1 264.02
## - wool:tension  2    1002.8 6747.9 268.71
## 
## Call:
## lm(formula = breaks ~ wool * tension, data = warpbreaks)
## 
## Coefficients:
##    (Intercept)           woolB        tensionM        tensionH  
##          44.56          -16.33          -20.56          -20.00  
## woolB:tensionM  woolB:tensionH  
##          21.11           10.56

Model diagnostics

plot(lm.mod2) #this will produce 4 plots of residuals

shapiro.test(resid(lm.mod2)) #standard shapiro-wilk test. interpret with a grain of salt
## 
##  Shapiro-Wilk normality test
## 
## data:  resid(lm.mod2)
## W = 0.98686, p-value = 0.8162

Least squares means & contrasts

# extract least squares means
(lsm <- emmeans(lm.mod2, ~ tension))
## NOTE: Results may be misleading due to involvement in interactions
##  tension   emmean      SE df lower.CL upper.CL
##  L       36.38889 2.57865 48 31.20417 41.57361
##  M       26.38889 2.57865 48 21.20417 31.57361
##  H       21.66667 2.57865 48 16.48194 26.85139
## 
## Results are averaged over the levels of: wool 
## Confidence level used: 0.95
# make all pairwise comparisons within each factor
contrast(lsm, "pairwise")
##  contrast  estimate       SE df t.ratio p.value
##  L - M    10.000000 3.646761 48   2.742  0.0229
##  L - H    14.722222 3.646761 48   4.037  0.0006
##  M - H     4.722222 3.646761 48   1.295  0.4049
## 
## Results are averaged over the levels of: wool 
## P value adjustment: tukey method for comparing a family of 3 estimates
# compare low tensions versus Medium and Medium
levels(warpbreaks$tension)
## [1] "L" "M" "H"
cList <- list(LvMH = c(1, -0.5, -0.5))

# check that contrast sums to zero
lapply(cList, sum)
## $LvMH
## [1] 0
# perform contrast
summary(contrast(lsm, cList, adjust = "none"))
##  contrast estimate       SE df t.ratio p.value
##  LvMH     12.36111 3.158188 48   3.914  0.0003
## 
## Results are averaged over the levels of: wool

Randomised Complete Block Design (RCBD) - fixed effects model

This example uses rapeseed yield data from multiple locations, years and cultivars. Within a single location or year, the replication is often balanced.

Load Data and examine:

data(shafii.rapeseed) 

rapeseed1987 <- shafii.rapeseed %>% filter(year == 87) %>% 
  mutate(block = fct_drop(rep), Cv = fct_drop(gen), loc = fct_drop(loc))

str(rapeseed1987)
## 'data.frame':    216 obs. of  7 variables:
##  $ year : int  87 87 87 87 87 87 87 87 87 87 ...
##  $ loc  : Factor w/ 9 levels "GGA","ID","MT",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ rep  : Factor w/ 4 levels "R1","R2","R3",..: 1 2 3 4 1 2 3 4 1 2 ...
##  $ gen  : Factor w/ 6 levels "Bienvenu","Bridger",..: 1 1 1 1 2 2 2 2 3 3 ...
##  $ yield: num  961 1329 1781 1698 1605 ...
##  $ block: Factor w/ 4 levels "R1","R2","R3",..: 1 2 3 4 1 2 3 4 1 2 ...
##  $ Cv   : Factor w/ 6 levels "Bienvenu","Bridger",..: 1 1 1 1 2 2 2 2 3 3 ...
na_func(rapeseed1987)
##  year   loc   rep   gen yield block    Cv 
##     0     0     0     0     0     0     0
table(rapeseed1987$Cv, rapeseed1987$loc) #experiment has 1 rep per block 
##           
##            GGA ID MT NC OR SC TGA TX WA
##   Bienvenu   4  4  4  4  4  4   4  4  4
##   Bridger    4  4  4  4  4  4   4  4  4
##   Cascade    4  4  4  4  4  4   4  4  4
##   Dwarf      4  4  4  4  4  4   4  4  4
##   Glacier    4  4  4  4  4  4   4  4  4
##   Jet        4  4  4  4  4  4   4  4  4
hist(rapeseed1987$yield, col = "gold")

boxplot(yield ~ Cv, data = rapeseed1987, col = "orangered")

boxplot(yield ~ loc, data = rapeseed1987, col = "chartreuse")

Analyse experiment:

# for this example, the analysis will only be done for a single year
# block is nested within location
# if each block had a unique name, 'Error(block)' would suffce
shaf.aov <- aov(yield ~ Cv*loc + Error(block) + Cv*loc, data = rapeseed1987)

summary(shaf.aov)
## 
## Error: block
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals  3 336565  112188               
## 
## Error: Within
##            Df    Sum Sq  Mean Sq F value   Pr(>F)    
## Cv          5   3203992   640798   2.645 0.025111 *  
## loc         8 318197192 39774649 164.165  < 2e-16 ***
## Cv:loc     40  22707425   567686   2.343 0.000103 ***
## Residuals 159  38523267   242285                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
emmeans(shaf.aov, ~ Cv | loc)
## Note: re-fitting model with sum-to-zero contrasts
## loc = GGA:
##  Cv         emmean       SE     df lower.CL upper.CL
##  Bienvenu 1442.317 244.8854 161.14  958.719 1925.916
##  Bridger  1363.295 244.8854 161.14  879.696 1846.894
##  Cascade  1504.527 244.8854 161.14 1020.929 1988.126
##  Dwarf    1294.920 244.8854 161.14  811.321 1778.519
##  Glacier  1680.510 244.8854 161.14 1196.911 2164.109
##  Jet      1090.917 244.8854 161.14  607.319 1574.516
## 
## loc = ID:
##  Cv         emmean       SE     df lower.CL upper.CL
##  Bienvenu 1242.155 244.8854 161.14  758.556 1725.754
##  Bridger   946.671 244.8854 161.14  463.072 1430.269
##  Cascade   773.284 244.8854 161.14  289.685 1256.883
##  Dwarf     931.553 244.8854 161.14  447.954 1415.151
##  Glacier  1111.030 244.8854 161.14  627.431 1594.629
##  Jet      1064.030 244.8854 161.14  580.431 1547.629
## 
## loc = MT:
##  Cv         emmean       SE     df lower.CL upper.CL
##  Bienvenu 2616.000 244.8854 161.14 2132.401 3099.599
##  Bridger  2828.250 244.8854 161.14 2344.651 3311.849
##  Cascade  2916.250 244.8854 161.14 2432.651 3399.849
##  Dwarf    3451.750 244.8854 161.14 2968.151 3935.349
##  Glacier  3306.750 244.8854 161.14 2823.151 3790.349
##  Jet      3660.500 244.8854 161.14 3176.901 4144.099
## 
## loc = NC:
##  Cv         emmean       SE     df lower.CL upper.CL
##  Bienvenu 1000.962 244.8854 161.14  517.363 1484.560
##  Bridger  1064.289 244.8854 161.14  580.691 1547.888
##  Cascade   745.398 244.8854 161.14  261.800 1228.997
##  Dwarf    1013.852 244.8854 161.14  530.254 1497.451
##  Glacier  1229.343 244.8854 161.14  745.744 1712.941
##  Jet      1673.780 244.8854 161.14 1190.181 2157.379
## 
## loc = OR:
##  Cv         emmean       SE     df lower.CL upper.CL
##  Bienvenu 4555.895 244.8854 161.14 4072.296 5039.494
##  Bridger  2529.867 244.8854 161.14 2046.269 3013.466
##  Cascade  3335.792 244.8854 161.14 2852.194 3819.391
##  Dwarf    3931.832 244.8854 161.14 3448.234 4415.431
##  Glacier  4185.157 244.8854 161.14 3701.559 4668.756
##  Jet      3219.500 244.8854 161.14 2735.901 3703.099
## 
## loc = SC:
##  Cv         emmean       SE     df lower.CL upper.CL
##  Bienvenu 2499.750 244.8854 161.14 2016.151 2983.349
##  Bridger  2705.000 244.8854 161.14 2221.401 3188.599
##  Cascade  2118.750 244.8854 161.14 1635.151 2602.349
##  Dwarf    1893.750 244.8854 161.14 1410.151 2377.349
##  Glacier  2717.250 244.8854 161.14 2233.651 3200.849
##  Jet      2832.750 244.8854 161.14 2349.151 3316.349
## 
## loc = TGA:
##  Cv         emmean       SE     df lower.CL upper.CL
##  Bienvenu 1257.778 244.8854 161.14  774.180 1741.377
##  Bridger  1867.695 244.8854 161.14 1384.096 2351.294
##  Cascade  1707.827 244.8854 161.14 1224.229 2191.426
##  Dwarf     872.602 244.8854 161.14  389.004 1356.201
##  Glacier  1453.235 244.8854 161.14  969.636 1936.834
##  Jet       954.005 244.8854 161.14  470.406 1437.603
## 
## loc = TX:
##  Cv         emmean       SE     df lower.CL upper.CL
##  Bienvenu  838.000 244.8854 161.14  354.401 1321.599
##  Bridger  1069.000 244.8854 161.14  585.401 1552.599
##  Cascade   734.750 244.8854 161.14  251.151 1218.349
##  Dwarf     988.250 244.8854 161.14  504.651 1471.849
##  Glacier   951.750 244.8854 161.14  468.151 1435.349
##  Jet      1408.250 244.8854 161.14  924.651 1891.849
## 
## loc = WA:
##  Cv         emmean       SE     df lower.CL upper.CL
##  Bienvenu 4375.000 244.8854 161.14 3891.401 4858.599
##  Bridger  4603.750 244.8854 161.14 4120.151 5087.349
##  Cascade  4464.250 244.8854 161.14 3980.651 4947.849
##  Dwarf    3974.000 244.8854 161.14 3490.401 4457.599
##  Glacier  4740.000 244.8854 161.14 4256.401 5223.599
##  Jet      4344.250 244.8854 161.14 3860.651 4827.849
## 
## Confidence level used: 0.95

ANOVA for mixed models

(models with random and fixed effects)

Random effects are often those treatments levels drawn from a large population of possible treatment levels and there is interest in understanding the distribution and variance of that population. This in contrast to fixed effects, where the inferences are restricted to the treatment levels tested.

Blocking factors and Year are often considered random factors because a researcher is not interested in particular years or a blocking factor. When there is unbalanced replication, the variance components should be estimated with maximum likelihood or REML, which implies use of the packages “lmer” and/or “nlme”.

Randomised Complete Block Design (RCBD) - mixed effects

The “shafii.rapeseed” data set will be used for this section.

Analyse experiment using a mixed model:

This uses the function lme() from the package “nlme”. Functionally, it is very similar to calling lme4::lmer(). The degrees of freedom are different (lmer is using Satterthwaite’s approximation), but the p-values are the same.

# turn year into the factor "Year"
shafii.rapeseed$Year <- as.factor(shafii.rapeseed$year)
# create a blocking variable that is unique for each location-by-year combination
# so R doesn't conflate "R1" from one location/year with another location/year
shafii.rapeseed$Rep <- as.factor(paste(shafii.rapeseed$loc, shafii.rapeseed$year, shafii.rapeseed$rep, sep = "_"))

shaf.lme <- lme(fixed = yield ~ gen*loc + Year,
                  random = ~ 1|Rep,
                  data = shafii.rapeseed, method = "REML")

# view sum of squares table 
# when anova() is called for an lme object, the function called is actually anova.lme()
anova(shaf.lme, type = "marginal") # "marginal" is equivalent to type III sums of squares
##             numDF denDF   F-value p-value
## (Intercept)     1   470 16.204597  0.0001
## gen             5   470  1.092341  0.3637
## loc            13    92 13.074492  <.0001
## Year            2    92  2.035054  0.1365
## gen:loc        65   470  2.575753  <.0001
# FYI: use "anova(model.lme)" for type I sums of squares

# lmer notation
shaf.lmer <- lmer(yield ~ gen*loc + Year + (1|Rep),
                  data = shafii.rapeseed, REML = T)
anova(shaf.lmer, type = "marginal")
## Marginal Analysis of Variance Table with Satterthwaite's method
##           Sum Sq Mean Sq NumDF  DenDF F value    Pr(>F)    
## gen      1860586  372117     5 470.00  1.0923    0.3637    
## loc     57901483 4453960    13 159.37 13.0745 < 2.2e-16 ***
## Year     1386524  693262     2  92.00  2.0351    0.1365    
## gen:loc 57034691  877457    65 470.00  2.5758 5.499e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Diagnostics, model building

plot(shaf.lme)

qqnorm(shaf.lme, abline = c(0, 1))

Least squares means

# for cultivar 
(lme.means.cv <- emmeans(shaf.lme, "gen"))
## NOTE: Results may be misleading due to involvement in interactions
##  gen        emmean       SE df lower.CL upper.CL
##  Bienvenu 2432.485 111.6229 92 2210.792 2654.178
##  Bridger  2313.914 111.6229 92 2092.221 2535.607
##  Cascade  2184.142 111.6229 92 1962.449 2405.835
##  Dwarf    2308.377 111.6229 92 2086.684 2530.070
##  Glacier  2463.481 111.6229 92 2241.788 2685.173
##  Jet      2303.783 111.6229 92 2082.091 2525.476
## 
## Results are averaged over the levels of: loc, Year 
## Degrees-of-freedom method: containment 
## Confidence level used: 0.95
# for location
(lme.means.loc <- emmeans(shaf.lme, "loc"))
## NOTE: Results may be misleading due to involvement in interactions
##  loc   emmean       SE df lower.CL upper.CL
##  GGA 1682.169 328.5723 92 1029.596 2334.742
##  ID  4217.051 261.3686 92 3697.950 4736.151
##  KS  1119.594 476.3386 92  173.544 2065.644
##  MS  2203.887 476.3386 92 1257.837 3149.936
##  MT  3338.808 473.7707 92 2397.858 4279.758
##  NC  1328.244 328.5723 92  675.671 1980.817
##  NY  3138.795 476.3386 92 2192.746 4084.845
##  OR  3292.090 328.5723 92 2639.517 3944.663
##  SC  1818.962 261.3686 92 1299.862 2338.063
##  TGA 1028.288 261.3686 92  509.187 1547.388
##  TN  2543.467 476.3386 92 1597.418 3489.517
##  TX   826.748 328.5723 92  174.175 1479.321
##  VA  2281.657 327.6428 92 1630.930 2932.384
##  WA  3861.333 261.3686 92 3342.233 4380.434
## 
## Results are averaged over the levels of: gen, Year 
## Degrees-of-freedom method: containment 
## Confidence level used: 0.95
# for cultivar means within each location
lme.means.int <- emmeans(shaf.lme, ~ gen | loc + Year)

# this code would produce location means within each cultivar 
# emmeans(model.lme, ~ loc | gen))
#  all possible combinations exist!

# also: 
# emmeans(model.lme, ~ loc | gen)) is the same as emmeans(model.lme, ~ gen | loc))

Pairwise Contrasts:

# all pairwise
pairs(lme.means.cv)
##  contrast             estimate       SE  df t.ratio p.value
##  Bienvenu - Bridger  118.57068 87.61532 470   1.353  0.7548
##  Bienvenu - Cascade  248.34262 87.61532 470   2.834  0.0539
##  Bienvenu - Dwarf    124.10773 87.61532 470   1.417  0.7170
##  Bienvenu - Glacier  -30.99567 87.61532 470  -0.354  0.9993
##  Bienvenu - Jet      128.70157 87.61532 470   1.469  0.6843
##  Bridger - Cascade   129.77193 87.61532 470   1.481  0.6765
##  Bridger - Dwarf       5.53704 87.61532 470   0.063  1.0000
##  Bridger - Glacier  -149.56635 87.61532 470  -1.707  0.5277
##  Bridger - Jet        10.13088 87.61532 470   0.116  1.0000
##  Cascade - Dwarf    -124.23489 87.61532 470  -1.418  0.7161
##  Cascade - Glacier  -279.33829 87.61532 470  -3.188  0.0190
##  Cascade - Jet      -119.64105 87.61532 470  -1.366  0.7477
##  Dwarf - Glacier    -155.10340 87.61532 470  -1.770  0.4861
##  Dwarf - Jet           4.59384 87.61532 470   0.052  1.0000
##  Glacier - Jet       159.69724 87.61532 470   1.823  0.4521
## 
## Results are averaged over the levels of: loc, Year 
## P value adjustment: tukey method for comparing a family of 6 estimates
# HSD-like comparisons:

CLD(lme.means.cv)
##  gen        emmean       SE df lower.CL upper.CL .group
##  Cascade  2184.142 111.6229 92 1962.449 2405.835  1    
##  Jet      2303.783 111.6229 92 2082.091 2525.476  12   
##  Dwarf    2308.377 111.6229 92 2086.684 2530.070  12   
##  Bridger  2313.914 111.6229 92 2092.221 2535.607  12   
##  Bienvenu 2432.485 111.6229 92 2210.792 2654.178  12   
##  Glacier  2463.481 111.6229 92 2241.788 2685.173   2   
## 
## Results are averaged over the levels of: loc, Year 
## Degrees-of-freedom method: containment 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 6 estimates 
## significance level used: alpha = 0.05
CLD(lme.means.loc)
##  loc   emmean       SE df lower.CL upper.CL .group 
##  TX   826.748 328.5723 92  174.175 1479.321  1     
##  TGA 1028.288 261.3686 92  509.187 1547.388  1     
##  KS  1119.594 476.3386 92  173.544 2065.644  123   
##  NC  1328.244 328.5723 92  675.671 1980.817  12    
##  GGA 1682.169 328.5723 92 1029.596 2334.742  1234  
##  SC  1818.962 261.3686 92 1299.862 2338.063  123   
##  MS  2203.887 476.3386 92 1257.837 3149.936  12345 
##  VA  2281.657 327.6428 92 1630.930 2932.384  1234  
##  TN  2543.467 476.3386 92 1597.418 3489.517  123456
##  NY  3138.795 476.3386 92 2192.746 4084.845   23456
##  OR  3292.090 328.5723 92 2639.517 3944.663     456
##  MT  3338.808 473.7707 92 2397.858 4279.758    3456
##  WA  3861.333 261.3686 92 3342.233 4380.434      56
##  ID  4217.051 261.3686 92 3697.950 4736.151       6
## 
## Results are averaged over the levels of: gen, Year 
## Degrees-of-freedom method: containment 
## Confidence level used: 0.95 
## P value adjustment: tukey method for comparing a family of 14 estimates 
## significance level used: alpha = 0.05
# CLD(lme.means.int) # not actually run because it results in copious output

# plot same results
plot(lme.means.cv, comparison = T)

plot(lme.means.loc, comparison = T, horizontal = F) # rotate plots to vertical position
## Comparison discrepancy in group 1, GGA - OR:
##     Target overlap = 0.0083, overlap on graph = -0.0111

# blue bars = lsmeans confidence 95% confidence intervals
# red arrows. pairwise differnces (overlapping arrows = not significantly different)

Interaction plots can also be done:

(but, it gets unwieldy)

plot(lme.means.int, comparison = T, adjust = "tukey")

Other preset contrasts:

# compare to a control, e.g. "Bridger"
levels(shafii.rapeseed$gen)
## [1] "Bienvenu" "Bridger"  "Cascade"  "Dwarf"    "Glacier"  "Jet"
# Bridger is listed in position 2 of the factor 'shafii.rapeseed$gen'
# so '2' is set as the reference level in the following contrast statement: 

# "trt.vs.ctrlk" (treatment versus control treatment k) is a specific option to compare all treatment levels to a user-defined level
# by default, it will use the last level as the reference level

contrast(lme.means.cv, "trt.vs.ctrlk", ref = 2) 
##  contrast             estimate       SE  df t.ratio p.value
##  Bienvenu - Bridger  118.57068 87.61532 470   1.353  0.5118
##  Cascade - Bridger  -129.77193 87.61532 470  -1.481  0.4315
##  Dwarf - Bridger      -5.53704 87.61532 470  -0.063  0.9998
##  Glacier - Bridger   149.56635 87.61532 470   1.707  0.3034
##  Jet - Bridger       -10.13088 87.61532 470  -0.116  0.9990
## 
## Results are averaged over the levels of: loc, Year 
## P value adjustment: dunnettx method for 5 tests

Search ?contrast.emmGrid to see full list of options for preset contrasts.

Custom constrasts

# example: contrast Western locations versus Southern locations

# first, find out what levels are present
unique(shafii.rapeseed$loc)
##  [1] GGA ID  KS  MS  MT  NC  NY  OR  SC  TGA TN  TX  VA  WA 
## Levels: GGA ID KS MS MT NC NY OR SC TGA TN TX VA WA
# next create a contrast list 
# this is a list of coefficients as long your list of treatement levels
# indicating what coefficients to give each treatment level

# in this example, levels "ID", "MT", "OR", and "WA" are contrasted versus
# "NC", "SC", "MS", "TN", "TX" and "VA"

cList <- list(West_V_South = c(0, 1/4, 0, -1/6, 1/4, -1/6, 0, 1/4, -1/6, 0, -1/6, -1/6, -1/6, 1/4))

# check that each contrast sums to zero:
lapply(cList, sum)
## $West_V_South
## [1] 5.551115e-17
lme.means.loc2 <- emmeans(shaf.lme, "loc", contr = cList)
## NOTE: Results may be misleading due to involvement in interactions
summary(lme.means.loc2)
## $emmeans
##  loc   emmean       SE df lower.CL upper.CL
##  GGA 1682.169 328.5723 92 1029.596 2334.742
##  ID  4217.051 261.3686 92 3697.950 4736.151
##  KS  1119.594 476.3386 92  173.544 2065.644
##  MS  2203.887 476.3386 92 1257.837 3149.936
##  MT  3338.808 473.7707 92 2397.858 4279.758
##  NC  1328.244 328.5723 92  675.671 1980.817
##  NY  3138.795 476.3386 92 2192.746 4084.845
##  OR  3292.090 328.5723 92 2639.517 3944.663
##  SC  1818.962 261.3686 92 1299.862 2338.063
##  TGA 1028.288 261.3686 92  509.187 1547.388
##  TN  2543.467 476.3386 92 1597.418 3489.517
##  TX   826.748 328.5723 92  174.175 1479.321
##  VA  2281.657 327.6428 92 1630.930 2932.384
##  WA  3861.333 261.3686 92 3342.233 4380.434
## 
## Results are averaged over the levels of: gen, Year 
## Degrees-of-freedom method: containment 
## Confidence level used: 0.95 
## 
## $contrasts
##  contrast     estimate       SE df t.ratio p.value
##  West_V_South 1843.493 233.0729 92   7.910  <.0001
## 
## Results are averaged over the levels of: gen, Year
# same contrast can also be done within each level of Cv:
emmeans(shaf.lme, ~ loc | gen, contr = cList)
## $emmeans
## gen = Bienvenu:
##  loc   emmean       SE df lower.CL upper.CL
##  GGA 1784.523 378.7416 92 1032.309 2536.736
##  ID  4742.446 303.2664 92 4140.133 5344.759
##  KS  1178.552 545.7741 92   94.598 2262.507
##  MS  2455.425 545.7741 92 1371.470 3539.379
##  MT  2824.891 543.5344 92 1745.385 3904.398
##  NC  1329.577 378.7416 92  577.363 2081.790
##  NY  2933.641 545.7741 92 1849.687 4017.596
##  OR  4118.193 378.7416 92 3365.980 4870.407
##  SC  1843.733 303.2664 92 1241.419 2446.046
##  TGA  893.306 303.2664 92  290.992 1495.619
##  TN  2964.549 545.7741 92 1880.594 4048.503
##  TX   919.252 378.7416 92  167.038 1671.466
##  VA  2123.952 377.9355 92 1373.339 2874.564
##  WA  3942.750 303.2664 92 3340.437 4545.063
## 
## gen = Bridger:
##  loc   emmean       SE df lower.CL upper.CL
##  GGA 1470.389 378.7416 92  718.175 2222.602
##  ID  3591.466 303.2664 92 2989.153 4193.779
##  KS  1091.302 545.7741 92    7.348 2175.257
##  MS  2477.845 545.7741 92 1393.890 3561.799
##  MT  3037.141 543.5344 92 1957.635 4116.648
##  NC  1479.492 378.7416 92  727.279 2231.706
##  NY  3129.554 545.7741 92 2045.599 4213.508
##  OR  2564.206 378.7416 92 1811.992 3316.419
##  SC  2281.892 303.2664 92 1679.579 2884.206
##  TGA 1602.976 303.2664 92 1000.663 2205.290
##  TN  2485.286 545.7741 92 1401.332 3569.241
##  TX   851.291 378.7416 92   99.078 1603.505
##  VA  2397.374 377.9355 92 1646.762 3147.987
##  WA  3934.583 303.2664 92 3332.270 4536.897
## 
## gen = Cascade:
##  loc   emmean       SE df lower.CL upper.CL
##  GGA 1758.460 378.7416 92 1006.247 2510.674
##  ID  4081.356 303.2664 92 3479.042 4683.669
##  KS   890.552 545.7741 92 -193.402 1974.507
##  MS  1597.995 545.7741 92  514.040 2681.949
##  MT  3125.141 543.5344 92 2045.635 4204.648
##  NC  1061.822 378.7416 92  309.608 1814.035
##  NY  2586.169 545.7741 92 1502.214 3670.123
##  OR  2806.038 378.7416 92 2053.825 3558.252
##  SC  1982.013 303.2664 92 1379.699 2584.326
##  TGA 1492.147 303.2664 92  889.833 2094.460
##  TN  2006.321 545.7741 92  922.367 3090.276
##  TX   795.807 378.7416 92   43.593 1548.020
##  VA  2191.172 377.9355 92 1440.559 2941.784
##  WA  4203.000 303.2664 92 3600.687 4805.313
## 
## gen = Dwarf:
##  loc   emmean       SE df lower.CL upper.CL
##  GGA 1537.924 378.7416 92  785.710 2290.137
##  ID  4326.120 303.2664 92 3723.807 4928.433
##  KS  1207.802 545.7741 92  123.848 2291.757
##  MS  1965.647 545.7741 92  881.693 3049.602
##  MT  3660.641 543.5344 92 2581.135 4740.148
##  NC  1320.748 378.7416 92  568.534 2072.961
##  NY  3645.214 545.7741 92 2561.259 4729.168
##  OR  3593.612 378.7416 92 2841.398 4345.825
##  SC  1292.411 303.2664 92  690.098 1894.724
##  TGA  450.502 303.2664 92 -151.811 1052.815
##  TN  2687.529 545.7741 92 1603.574 3771.483
##  TX   653.570 378.7416 92  -98.644 1405.783
##  VA  2249.727 377.9355 92 1499.115 3000.340
##  WA  3725.833 303.2664 92 3123.520 4328.147
## 
## gen = Glacier:
##  loc   emmean       SE df lower.CL upper.CL
##  GGA 2030.559 378.7416 92 1278.345 2782.772
##  ID  4298.921 303.2664 92 3696.608 4901.234
##  KS  1267.802 545.7741 92  183.848 2351.757
##  MS  2860.772 545.7741 92 1776.818 3944.727
##  MT  3515.641 543.5344 92 2436.135 4595.148
##  NC  1452.032 378.7416 92  699.818 2204.245
##  NY  3301.441 545.7741 92 2217.487 4385.396
##  OR  3471.574 378.7416 92 2719.361 4223.788
##  SC  2025.279 303.2664 92 1422.966 2627.592
##  TGA 1109.218 303.2664 92  506.905 1711.531
##  TN  2264.536 545.7741 92 1180.582 3348.491
##  TX   720.368 378.7416 92  -31.846 1472.581
##  VA  2363.251 377.9355 92 1612.638 3113.863
##  WA  3807.333 303.2664 92 3205.020 4409.647
## 
## gen = Jet:
##  loc   emmean       SE df lower.CL upper.CL
##  GGA 1511.161 378.7416 92  758.948 2263.375
##  ID  4261.996 303.2664 92 3659.683 4864.309
##  KS  1081.552 545.7741 92   -2.402 2165.507
##  MS  1865.635 545.7741 92  781.680 2949.589
##  MT  3869.391 543.5344 92 2789.885 4948.898
##  NC  1325.793 378.7416 92  573.579 2078.006
##  NY  3236.754 545.7741 92 2152.799 4320.708
##  OR  3198.914 378.7416 92 2446.701 3951.128
##  SC  1488.445 303.2664 92  886.132 2090.758
##  TGA  621.578 303.2664 92   19.265 1223.891
##  TN  2852.581 545.7741 92 1768.627 3936.536
##  TX  1020.200 378.7416 92  267.987 1772.414
##  VA  2364.466 377.9355 92 1613.853 3115.078
##  WA  3554.500 303.2664 92 2952.187 4156.813
## 
## Results are averaged over the levels of: Year 
## Degrees-of-freedom method: containment 
## Confidence level used: 0.95 
## 
## $contrasts
## gen = Bienvenu:
##  contrast     estimate       SE df t.ratio p.value
##  West_V_South 1967.656 267.3775 92   7.359  <.0001
## 
## gen = Bridger:
##  contrast     estimate       SE df t.ratio p.value
##  West_V_South 1286.319 267.3775 92   4.811  <.0001
## 
## gen = Cascade:
##  contrast     estimate       SE df t.ratio p.value
##  West_V_South 1948.029 267.3775 92   7.286  <.0001
## 
## gen = Dwarf:
##  contrast     estimate       SE df t.ratio p.value
##  West_V_South 2131.613 267.3775 92   7.972  <.0001
## 
## gen = Glacier:
##  contrast     estimate       SE df t.ratio p.value
##  West_V_South 1825.661 267.3775 92   6.828  <.0001
## 
## gen = Jet:
##  contrast     estimate       SE df t.ratio p.value
##  West_V_South 1901.680 267.3775 92   7.112  <.0001
## 
## Results are averaged over the levels of: Year

To conduct custom contrasts on a another variable (e.g. “loc”), that needs its own cList and emmeans statement.

ANCOVA

(analysis of covariance) From a R programming perspective, this is no different than running a standard linear model. A data set from agridat, “theobald covariate” comparing corn silage yields across multiple years, locations and cultivars. The data set includes a covariate, “chu” (corn heat units, a bit like growing degree days).

Load data and examine:

data(theobald.covariate)
str(theobald.covariate)
## 'data.frame':    256 obs. of  5 variables:
##  $ year : int  1990 1990 1990 1990 1990 1991 1991 1991 1991 1991 ...
##  $ env  : Factor w/ 7 levels "E1","E2","E3",..: 1 2 3 4 7 1 2 3 4 5 ...
##  $ gen  : Factor w/ 10 levels "G01","G02","G03",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ yield: num  6.27 5.57 8.45 7.35 6.5 6.71 5.59 8.36 7.25 8.09 ...
##  $ chu  : num  2.57 2.53 2.72 2.72 2.48 2.44 2.55 2.75 2.75 2.61 ...
na_func(theobald.covariate)
##  year   env   gen yield   chu 
##     0     0     0     0     0

Exploratory plots:

# distributions of continuous variables
hist(theobald.covariate$yield, col = "gold")

hist(theobald.covariate$chu, col = "gray70")

# relationship between reponse variable and covariate:
with(theobald.covariate, plot(chu, yield))

length(unique(theobald.covariate$chu))
## [1] 21
# the usual boxplots: 
boxplot(yield ~ env, data = theobald.covariate, col = "orangered")

boxplot(yield ~ year, data = theobald.covariate, col = "chartreuse")

boxplot(yield ~ gen, data = theobald.covariate, col = "darkcyan")

Check level of replication:

theobald.covariate$Year <- as.factor(theobald.covariate$year)
replications(yield ~ Year + env + gen, data = theobald.covariate)
## $Year
## Year
## 1990 1991 1992 1993 1994 
##   40   63   60   45   48 
## 
## $env
## env
## E1 E2 E3 E4 E5 E6 E7 
## 35 35 44 36 36 36 34 
## 
## $gen
## gen
## G01 G02 G03 G04 G05 G06 G07 G08 G09 G10 
##  29  29  29  29  22  29  23  18  24  24
# with(theobald.covariate, table(gen, env, Year)) # lots of useful output

The treatments are not fully crossed, so a fully specified model of the form yield ~ Year*env*gen*chu cannot be tested. The treatments and interactions were tested in reduced models and compared. The final “best” model is shown below.

# the covariate, chu, is added in like any other effect. 
theobald.lm2 <- lm(yield ~  Year + env*chu, data = theobald.covariate)
Anova(theobald.lm2, type = "III")
## Anova Table (Type III tests)
## 
## Response: yield
##              Sum Sq  Df F value    Pr(>F)    
## (Intercept)   4.309   1  6.8321  0.009524 ** 
## Year         76.589   4 30.3607 < 2.2e-16 ***
## env          13.473   6  3.5607  0.002138 ** 
## chu          11.831   1 18.7596 2.187e-05 ***
## env:chu      13.376   6  3.5350  0.002268 ** 
## Residuals   150.096 238                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# how to extract the covariate slope(s): 
emtrends(theobald.lm2, ~ env, "chu")
##  env chu.trend       SE  df  lower.CL  upper.CL
##  E1   7.014988 1.619627 238  3.824352 10.205624
##  E2   0.978518 4.437201 238 -7.762686  9.719722
##  E3   4.099310 3.152818 238 -2.111683 10.310303
##  E4  -2.884280 3.543760 238 -9.865421  4.096861
##  E5   8.222358 2.700385 238  2.902649 13.542066
##  E6   3.424701 2.720563 238 -1.934759  8.784161
##  E7  -0.358948 2.548834 238 -5.380105  4.662209
## 
## Results are averaged over the levels of: Year 
## Confidence level used: 0.95
# emmeans extracted as usual:
emmeans(theobald.lm2, ~ env)
## NOTE: Results may be misleading due to involvement in interactions
##  env   emmean        SE  df lower.CL upper.CL
##  E1  6.666544 0.1749185 238 6.321958 7.011131
##  E2  5.134968 0.2563050 238 4.630052 5.639884
##  E3  6.658226 0.4819422 238 5.708809 7.607644
##  E4  7.224342 0.5075654 238 6.224447 8.224236
##  E5  6.608382 0.1384270 238 6.335684 6.881081
##  E6  6.430967 0.2359627 238 5.966125 6.895809
##  E7  6.318851 0.3972573 238 5.536262 7.101440
## 
## Results are averaged over the levels of: Year 
## Confidence level used: 0.95
emmeans(theobald.lm2, ~ Year)
##  Year   emmean        SE  df lower.CL upper.CL
##  1990 6.969702 0.1885709 238 6.598221 7.341183
##  1991 6.746088 0.1697058 238 6.411771 7.080405
##  1992 7.070521 0.1872254 238 6.701690 7.439351
##  1993 5.390109 0.2080069 238 4.980339 5.799878
##  1994 5.996638 0.2183068 238 5.566578 6.426699
## 
## Results are averaged over the levels of: env 
## Confidence level used: 0.95

ANOVA - Special Designs

  • split plot
  • split-split plot (see the agricolae package)
  • split-block (see the agricolae package)
  • lattice (see the agricolae package)

Split-plot

Load data Load “Oats” from nlme. Nitrogen level (“nitro”) is the main plot, cultivar (“Variety”) is the sub-plot and “Block” describes the blocking design.

data(Oats) 
str(Oats)
## Classes 'nfnGroupedData', 'nfGroupedData', 'groupedData' and 'data.frame':   72 obs. of  4 variables:
##  $ Block  : Ord.factor w/ 6 levels "VI"<"V"<"III"<..: 6 6 6 6 6 6 6 6 6 6 ...
##  $ Variety: Factor w/ 3 levels "Golden Rain",..: 3 3 3 3 1 1 1 1 2 2 ...
##  $ nitro  : num  0 0.2 0.4 0.6 0 0.2 0.4 0.6 0 0.2 ...
##  $ yield  : num  111 130 157 174 117 114 161 141 105 140 ...
##  - attr(*, "formula")=Class 'formula'  language yield ~ nitro | Block
##   .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
##  - attr(*, "labels")=List of 2
##   ..$ y: chr "Yield"
##   ..$ x: chr "Nitrogen concentration"
##  - attr(*, "units")=List of 2
##   ..$ y: chr "(bushels/acre)"
##   ..$ x: chr "(cwt/acre)"
##  - attr(*, "inner")=Class 'formula'  language ~Variety
##   .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
na_func(Oats)
##   Block Variety   nitro   yield 
##       0       0       0       0
Oats$N <- as.factor(Oats$nitro)
replications(yield ~ Variety*N*Block, data = Oats)
##         Variety               N           Block       Variety:N 
##              24              18              12               6 
##   Variety:Block         N:Block Variety:N:Block 
##               4               3               1
table(Oats$Variety, Oats$N)
##              
##               0 0.2 0.4 0.6
##   Golden Rain 6   6   6   6
##   Marvellous  6   6   6   6
##   Victory     6   6   6   6
hist(Oats$yield, col = "gold")

boxplot(yield ~ N, data = Oats, col = "dodgerblue1")

boxplot(yield ~ Variety, data = Oats, col = "red3")

Analyse data:

The format for specifying split-plot error terms is Error(blocking factor/main plot).

splot.oats <- aov(yield ~ Variety*nitro + Error(Block/Variety), data = Oats) 
summary(splot.oats)
## 
## Error: Block
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals  5  15875    3175               
## 
## Error: Block:Variety
##           Df Sum Sq Mean Sq F value Pr(>F)
## Variety    2   1786   893.2   1.485  0.272
## Residuals 10   6013   601.3               
## 
## Error: Within
##               Df Sum Sq Mean Sq F value Pr(>F)    
## nitro          1  19536   19536 115.771  1e-14 ***
## Variety:nitro  2    168      84   0.499   0.61    
## Residuals     51   8606     169                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
emtrends(splot.oats, ~ Variety, "nitro") # slopes
## Note: re-fitting model with sum-to-zero contrasts
## Note: re-fitting model with sum-to-zero contrasts
##  Variety     nitro.trend       SE df lower.CL  upper.CL
##  Golden Rain    75.33333 11.85854 51 51.52632  99.14035
##  Marvellous     64.58333 11.85854 51 40.77632  88.39035
##  Victory        81.08333 11.85854 51 57.27632 104.89035
## 
## Confidence level used: 0.95
emmeans(splot.oats, ~ Variety) # emmeans
## Note: re-fitting model with sum-to-zero contrasts
## NOTE: Results may be misleading due to involvement in interactions
##  Variety       emmean       SE    df lower.CL upper.CL
##  Golden Rain 127.1000 8.570757 12.85 108.5618 145.6382
##  Marvellous  129.1667 8.570757 12.85 110.6285 147.7048
##  Victory     121.9500 8.570757 12.85 103.4118 140.4882
## 
## Confidence level used: 0.95

Extra Functions

for extracting model parameters, diagnostics and other model information

These work differently with different R object types. That is, different output will result depending on if a “lm”, “lme” or “merMod” (lmer) object is used in the function call.

# extract model summary
summary()

#extract coefficients:
coef()

#extract residuals
resid()
rstudent()
residuals()

# extract predicted values
fits()

# make diagnostic plots
plot()

# extract influence measures:
influence.measures()

#other fir diagnostics:
cooks.distance()
dffits()
dfbeta()
hat()

To see the all functions available for a particular type of linear model object, use:

methods(class = "lm") # for lm objects
##  [1] add1                alias               anova              
##  [4] Anova               avPlot              Boot               
##  [7] bootCase            boxCox              brief              
## [10] case.names          ceresPlot           coerce             
## [13] confidenceEllipse   confint             Confint            
## [16] cooks.distance      crPlot              deltaMethod        
## [19] deviance            dfbeta              dfbetaPlots        
## [22] dfbetas             dfbetasPlots        drop1              
## [25] dummy.coef          durbinWatsonTest    effects            
## [28] emm_basis           extractAIC          family             
## [31] formula             hatvalues           hccm               
## [34] infIndexPlot        influence           influencePlot      
## [37] initialize          inverseResponsePlot kappa              
## [40] labels              leveneTest          leveragePlot       
## [43] linearHypothesis    logLik              mcPlot             
## [46] mmp                 model.frame         model.matrix       
## [49] ncvTest             nextBoot            nobs               
## [52] outlierTest         plot                powerTransform     
## [55] predict             Predict             print              
## [58] proj                qqnorm              qqPlot             
## [61] qr                  recover_data        residualPlot       
## [64] residualPlots       residuals           rstandard          
## [67] rstudent            S                   show               
## [70] sigmaHat            simulate            slotsFromS3        
## [73] spreadLevelPlot     summary             variable.names     
## [76] vcov               
## see '?methods' for accessing help and source code
methods(class = "lme") # for lme objects
##  [1] ACF              anova            Anova            augPred         
##  [5] coef             comparePred      confint          Confint         
##  [9] deltaMethod      deviance         emm_basis        extractAIC      
## [13] fitted           fixef            formula          getData         
## [17] getGroups        getGroupsFormula getResponse      getVarCov       
## [21] influence        intervals        linearHypothesis logLik          
## [25] matchCoefs       nobs             pairs            plot            
## [29] predict          print            qqnorm           ranef           
## [33] recover_data     residuals        S                sigma           
## [37] simulate         summary          update           VarCorr         
## [41] Variogram        vcov            
## see '?methods' for accessing help and source code
methods(class = "merMod") # for lmer objects
##  [1] anova            Anova            as.function      coef            
##  [5] confint          cooks.distance   deltaMethod      deviance        
##  [9] df.residual      drop1            emm_basis        extractAIC      
## [13] family           fitted           fixef            formula         
## [17] getL             getME            hatvalues        influence       
## [21] isGLMM           isLMM            isNLMM           isREML          
## [25] linearHypothesis logLik           matchCoefs       model.frame     
## [29] model.matrix     ngrps            nobs             plot            
## [33] predict          print            profile          ranef           
## [37] recover_data     refit            refitML          rePCA           
## [41] residuals        rstudent         show             sigma           
## [45] simulate         summary          terms            update          
## [49] VarCorr          vcov             vif              weights         
## see '?methods' for accessing help and source code

The package “emmeans” also supports a large number of models.